1. INTRODUCTION

Nowadays, with the rapid development of technology, many tasks can be done more practically and
easily. The internet has brought many benefits to the development of IT. However, it also has negative
effects, for example the spread of false news and expressions of hatred. The web can strengthen ideas
and opinions within a group or in an individual, which means that hate speech can also find fertile
ground on certain websites and social media [1].

European right-wing and racist movements often use digital space as a preferred place to spread
their message and recruit new members [2, 3]. They often spread racism on the web and contribute to
spreading an image of Islam as incompatible with Western values. This could lead to physical actions
against Muslims [4]. The term Islamophobia comes from the words "Islam" and "phobia", meaning "fear of
something about Islam", and it expresses a view of Islam as a threat to Western culture. Traced
back to Europe 14 centuries ago, it was one of the results of the "orientalism" of the Arab world.
In recent times, the term Islamophobia has reappeared in connection with the "war on terror" that
followed the terrorist attacks of September 11, 2001, which created negative social perceptions of
Islam [5]. A study of Islamophobia conducted by a British association [6] that aims to fight
anti-Muslim crime found that victims usually experience harassment both online and offline.
Recently, world leaders and human rights institutions assessed that the terror attacks on two mosques
in Christchurch, New Zealand, in which terrorists killed 49 people and injured at least 48 others,
were the effect of tolerating an attitude of hatred towards Islam (Islamophobia) in the West [7].

According to [8], online hate speech differs from offline abuse because the internet offers
anonymity, relatively cheap and easy access, a lack of physical presence, and especially spontaneity.
One branch of research that developed from the information explosion on the internet is
sentiment analysis. Sentiment analysis, also called opinion mining, is a field of study that analyzes
people's sentiments, opinions, attitudes, emotions, evaluations, and judgments about a service, product,
individual, organization, event, topic, attribute, or problem [9]. Social media such as Twitter allow
users to send messages in real time. Data mining, in turn, is a method for discovering useful
knowledge in databases; it is the process of finding interesting, useful patterns and relationships
in large volumes of data. Data mining can therefore be used for sentiment analysis on the large
amount of data that Twitter provides.

Previous research has applied sentiment analysis to various themes, such as Islamophobia
in far-right UK political parties, Islamophobic behavior after #ParisAttacks, election result prediction,
e-commerce, consumer reviews, and algorithm evaluation. In that research, Naive Bayes and the support
vector machine (SVM) algorithm have the best average classification results in sentiment analysis.
The election result prediction research applied the valence aware dictionary and sentiment reasoner
(VADER) to label the training dataset. The algorithm evaluation research used the synthetic minority
oversampling technique (SMOTE) to balance imbalanced data.

This research discusses how to perform sentiment analysis on tweets from Twitter users
about Islamophobia. The large number of ideas and public opinions expressed via Twitter makes it
possible to analyze, at a global level, sentiment about Islamophobia towards Muslims around the world
on the day of the Christchurch mosque attack in New Zealand. Based on the results of previous research,
the analysis uses a hybrid sentiment analysis that combines a lexicon-based method with
supervised machine learning. The lexicon-based method, VADER, is applied to label the data crawled
from Twitter automatically. The supervised machine learning method is used to test the results of
the automatic VADER labeling, and the Naive Bayes and SVM algorithms are compared. SMOTE
oversampling is also applied to balance the labeled data from VADER so as to improve the test
results of the algorithms.


2. RESEARCH METHOD 

Several studies address Islamophobia and the use of Twitter data. "Detecting weak and strong
Islamophobic hate speech on social media" builds a multi-class classifier that distinguishes
between non-Islamophobic, weak Islamophobic, and strong Islamophobic content [10]. "Predicting online
Islamophobic behavior after #ParisAttacks" collected tweets related to the Paris attacks
within 50 hours after the event [11]. "Election result prediction using Twitter sentiment analysis"
labels the training dataset using the valence aware dictionary and sentiment reasoner (VADER) and
then applies two algorithms, multinomial Naive Bayes and SVM [12]. "Sentiment analysis about
e-commerce from tweet using decision tree, k-nearest neighbor, and Naive Bayes" uses RapidMiner to
compare the decision tree, k-NN, and Naive Bayes classifiers, evaluates the models with 10-fold
cross-validation, and obtains the highest results from Naive Bayes [13]. "Analisis Sentimen Pada
Review Konsumen Menggunakan Metode Naive Bayes Dengan Seleksi Fitur Chi Square Untuk Rekomendasi
Lokasi Makanan Tradisional" (sentiment analysis of consumer reviews using the Naive Bayes method with
chi-square feature selection for recommending traditional food locations) uses chi-square feature
selection and classifies with the Naive Bayes method [14]. "An evaluation of SVM and Naive Bayes with
SMOTE on sentiment analysis data set" examines the factors that affect SVM and Naive Bayes in
sentiment classification when SMOTE is applied to the dataset, and also compares 10-fold
cross-validation with a 70:30 split in testing [15].

To frame this research, the term Islamophobia needs to be defined. It comes from the words "Islam"
and "phobia", meaning "fear of something about Islam", and it expresses a view of Islam as a threat to
Western culture. Traced back to Europe 14 centuries ago, it was one of the results of the
"orientalism" of the Arab world. In recent times, the term Islamophobia has reappeared in connection
with the "war on terror" that followed the terrorist attacks of September 11, 2001, which created
negative social perceptions of Islam [5]. Although far from a new form of fanaticism, Islamophobia
can have different nuances depending on the socio-political context. For example, [16] explains that
American Islamophobia is influenced by events in Europe such as Brexit, but at the same time is
connected to the legal system, national politics, racial demographics, and religion of the US.
Despite the lack of a universal understanding of the term, the Runnymede Trust defines
Islamophobia as "baseless hostility towards Islam" and describes this hostility as unfair
discrimination against Muslim communities and individuals and the exclusion of Muslims from social
and political affairs. This view has been reinforced by negative media representations of Islam [1].

Social media such as Twitter allow users to send messages in real time.
Such a message is popularly known as a tweet. A tweet is a short message limited to
140 characters. Because of this character limit, a tweet often contains
abbreviations, slang, and spelling errors [17]. All data are processed to find meaningful relationships,
patterns, and trends by examining the large data sets held in storage with pattern recognition
techniques as well as statistical and mathematical techniques [18].

Because Twitter data consist of text, the method used to mine them is text mining, which is
defined as the process of extracting implicit knowledge from textual data. Because the implicit
knowledge that is the output of text mining does not exist in the given storage, it must be
distinguished from information obtained from storage [19]. The extracted knowledge can then be used to
analyze people's sentiments, opinions, attitudes, emotions, evaluations, and judgments about a service,
product, individual, organization, event, topic, attribute, or problem. This task goes by many
slightly different names, for example sentiment analysis, opinion mining, opinion extraction,
sentiment mining, subjectivity analysis, affect analysis, emotion analysis, and review mining.
Sentiment analysis and opinion mining mainly focus on opinions that express or imply
positive or negative sentiments, and both fall under text mining [9]. Sentiments can be classified in
several ways, including machine learning, lexicon-based, and hybrid approaches [20].

For text labeling, we use the VADER tool, which is available online and can be used from
the Python programming language. VADER is a rule-based and lexicon-based sentiment analysis tool that
is specifically attuned to sentiment expressed on social media. It is a sentiment intensity analyzer
developed by Hutto and Gilbert. VADER takes a sentence as input and returns proportion values for
three categories (positive, neutral, and negative) together with a compound score (the polarity of
the whole sentence) [12]. The compound score is calculated by adding up the valence scores of each
word in the lexicon, adjusted according to the rules, and then normalizing the sum to lie between
-1 (the most extreme negative) and +1 (the most extreme positive). It is the most useful metric when
a single, unidimensional measure of sentiment for a given sentence is wanted, and calling it a
"normalized, weighted composite score" is accurate. The typical threshold values are [21]: positive
sentiment for a compound score of at least 0.05, neutral sentiment for a compound score between
-0.05 and 0.05, and negative sentiment for a compound score of at most -0.05.
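
The scoring step described above can be illustrated with a minimal Python sketch. It is not the
VADER library itself: the valence sums below are hypothetical inputs, and the normalization constant
alpha = 15 and the 0.05 cut-offs are taken from the VADER reference implementation and its
documentation as we understand them.

```python
import math

ALPHA = 15  # normalization constant assumed from the VADER reference implementation


def normalize(valence_sum, alpha=ALPHA):
    """Map an unbounded sum of valence scores into the open interval (-1, 1)."""
    return valence_sum / math.sqrt(valence_sum * valence_sum + alpha)


def label_from_compound(compound, threshold=0.05):
    """Turn a compound score into a class label using the usual cut-offs."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical rule-adjusted valence sums for three short texts.
for valence_sum in (3.1, 0.0, -2.4):
    compound = normalize(valence_sum)
    print(round(compound, 4), label_from_compound(compound))
```

In this research only the tweets labeled positive or negative are carried forward, so a neutral
label would simply be filtered out after this step.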

Because the data are not balanced, the synthetic minority oversampling technique (SMOTE) is used
to balance them. The SMOTE algorithm takes an oversampling approach to balance the original
training set. Instead of simply replicating minority class instances, the main idea of SMOTE
is to introduce synthetic examples. These new data are created by interpolating between several
minority class instances that lie within a defined neighborhood [22].
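
The interpolation idea can be sketched in a few lines of plain Python. This is a simplified
illustration rather than the implementation used in this research: the function name `smote`, the
toy feature vectors, and the choice of k are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features per instance (hypothetical values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=2)
print(len(new_points))
```

Each synthetic point lies on the line segment between a minority instance and one of its nearest
minority neighbours, which is what distinguishes SMOTE from simple replication.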

Prediction accuracy is determined using two commonly used classification algorithms,
namely Naive Bayes and SVM. Naive Bayes is a classification method based on probability and
statistics that predicts future probabilities from past experience using Bayes'
theorem [14, 23]. The SVM algorithm identifies a hyperplane (or a set of hyperplanes) that provides
the best separation between training data instances. Among all possible separating hyperplanes,
the SVM algorithm tries to identify the one that has the greatest distance to the closest training
instance of any class, because a larger margin is reflected in a lower generalization error of
the classifier [24, 25]. Both algorithms are evaluated using cross-validation, which is an approach
for evaluating the performance of machine learning models. The dataset is divided into two parts,
namely a training set and a testing set. With cross-validation, the model trained on the training
set is evaluated on the testing set to check for overfitting and to estimate how the machine
learning model will perform on independent data [15]. Classification performance can be described
precisely by a tool called the confusion matrix, which is traditionally arranged as 2x2 [26].
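
As an illustration of the 2x2 arrangement, the following Python sketch counts the four cells of a
binary confusion matrix. The row and column order chosen here (actual class in rows, positive class
first) and the example labels are assumptions for the sketch, since layouts differ between tools.

```python
def confusion_matrix(actual, predicted, positive="positive"):
    """Return [[TP, FN], [FP, TN]] for a binary classification problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # positive tweet predicted as positive
        elif a == positive:
            fn += 1  # positive tweet predicted as negative
        elif p == positive:
            fp += 1  # negative tweet predicted as positive
        else:
            tn += 1  # negative tweet predicted as negative
    return [[tp, fn], [fp, tn]]


# Hypothetical labels for six tweets.
actual = ["positive", "positive", "negative", "negative", "negative", "positive"]
predicted = ["positive", "negative", "negative", "positive", "negative", "positive"]
print(confusion_matrix(actual, predicted))
```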


2.1. Methodology

The method used to obtain information is text mining, which is used to classify tweets
relating to Islamophobia at the time of the Christchurch attack. The research methodology is
the cross-industry standard process for data mining (CRISP-DM) [27], which consists of six stages,
as shown in Figure 1.

Figure 1. CRISP-DM model


2.1.1. Business understanding

This research aims to find out public sentiment in response to "Islamophobia" on the day of
the attack on the mosque in Christchurch, New Zealand, as expressed on the social media platform
Twitter on March 15, 2019. The implementation uses the Python programming language along with
the required libraries.


2.1.2. Data understanding 

The data come from the social media platform Twitter, collected with the Islamophobia hashtag and
without filtering by language or region, so that tweets from around the world are retrieved. The
tweets were taken on the day of the attack on the mosque in Christchurch, New Zealand, on 15 March
2019, so that the data taken from the Twitter discussion were still relevant to the attack. The data
were retrieved using the "Data Scraper" application from data-miner.io, a plug-in for the Chrome
browser that scrapes data from HTML web pages and exports them to XLS, CSV, XLSX, or
TSV files.

Using "Data Scraper", 3115 tweets were collected. The data taken from Twitter consist of
several columns, namely username, fullname, date, first hashtags, alt hashtags, tweets (message
text), retweets, retweets (w/text), likes, likes (w/text), and URL path. The only column used in
this research is "tweets (message text)".


2.1.3. Data preparation 

There are two stages of data preparation in this research, namely preprocessing phase 1 and phase 2.
Preprocessing phase 1 is carried out after crawling the Twitter data and consists of case folding,
cleansing, and translation into English. Preprocessing phase 2 is carried out after the data have
been labeled by VADER and consists of tokenizing, n-gram generation, stopword removal, and stemming.
The output of phase 2 is passed to the TF-IDF process, which performs word weighting.
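
The text-level steps of both phases can be sketched with the Python standard library. This is a
minimal sketch under stated assumptions: the cleansing rules, the tiny stopword list, and the example
tweet (including its shortened URL and user name) are hypothetical, and translation and stemming are
left out because they depend on external services or libraries.

```python
import re

STOPWORDS = {"the", "a", "an", "is", "in", "of", "to", "and"}  # tiny illustrative list


def clean_tweet(text):
    """Phase 1: case folding and cleansing."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[@#]\w+", " ", text)       # remove mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # remove numbers and punctuation
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Phase 2: split the cleaned text into word tokens."""
    return text.split()


def remove_stopwords(tokens):
    """Phase 2: drop tokens that carry little sentiment information."""
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n):
    """Phase 2: build word n-grams from a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tweet = "Standing with the victims in #Christchurch https://t.co/xyz @user 2019!"
tokens = remove_stopwords(tokenize(clean_tweet(tweet)))
print(tokens)
print(ngrams(tokens, 2))
```

Here stopwords are removed before the n-grams are built; applying the steps in a different order
changes which n-grams are produced, so the order is a design choice of this sketch.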


2.1.4. Modeling 

After the data preparation in the previous stage, the data are used as input to
the classification algorithms. The process is carried out with 10-fold cross-validation, that is,
training and testing are repeated 10 times on different splits of the prepared data. With
cross-validation, the model trained on the training set is evaluated on the testing set to check
for overfitting and to estimate how the model will perform on independent data.
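
The splitting behind 10-fold cross-validation can be illustrated as follows. The function name
`k_fold_indices`, the shuffling seed, and the sample count are assumptions for this sketch; a
library routine would normally be used in practice.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
print(len(splits), [len(test) for _, test in splits])
```

Each sample appears in exactly one testing fold, and the reported score is typically the average
over the 10 rounds.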

In this research, two algorithms are compared, namely Naive Bayes and SVM. The VADER labeling
produces a significant difference between the sizes of the positive and negative classes.
To handle this imbalanced data, SMOTE is applied, and both algorithms are compared with and
without it.
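
To make the Naive Bayes side of this comparison concrete, the sketch below implements a small
multinomial Naive Bayes classifier with add-one smoothing on token counts. It is an illustration
only and not the implementation used in this research: the function names and the four toy
documents are assumptions, and the SVM side is omitted because it requires a numerical optimizer.

```python
import math
from collections import Counter, defaultdict


def train_nb(documents, labels):
    """documents: list of token lists; labels: parallel list of class names."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(documents, labels):
        word_counts[label].update(tokens)
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocabulary


def predict_nb(model, tokens):
    """Return the class with the highest posterior log probability."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            # Laplace (add-one) smoothed log likelihood
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, already preprocessed training documents.
docs = [["peace", "love", "support"], ["hate", "attack"],
        ["love", "solidarity"], ["terror", "hate"]]
labels = ["positive", "negative", "positive", "negative"]
model = train_nb(docs, labels)
print(predict_nb(model, ["love", "peace"]))
```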


2.1.5. Evaluation

The evaluation phase measures the performance of the classification algorithms used in
the modeling stage. In this research, performance is measured using accuracy, precision, recall,
F1-score, and area under the curve (AUC), which is displayed together with the ROC curve, and
a wordcloud is used to visualize the spread of the positive and negative words that most often
arise from TF-IDF. These results show which algorithm is most appropriate for the proposed
modeling.
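
The measures listed above can be computed from the four confusion matrix counts and from the
classifier scores. The sketch below is a minimal illustration with hypothetical counts and scores;
the AUC is computed through its rank interpretation (the probability that a randomly chosen positive
instance is scored above a randomly chosen negative one), which is equivalent to the area under the
ROC curve.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from confusion matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and classifier scores (label 1 = positive class).
print(classification_metrics(tp=2, fp=1, fn=1, tn=2))
print(auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```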


2.1.6. Deployment 

The next stage is the deployment of the model in an application. The deployed model is
the one with the best evaluation value obtained at the evaluation stage. The application is
developed using the Python programming language along with the required libraries.


3. RESULTS AND ANALYSIS 
3.1. VADER result 

The labels obtained using the VADER library are positive, negative, and neutral. In this research,
only the positive and negative sentiment data from VADER are used in the next process. The total
data after preprocessing and labeling with VADER are shown in Figure 2.

Figure 2. Label distribution of data using VADER


3.2. Evaluation model result 

Processing each algorithm yields a confusion matrix, an ROC curve, and an AUC value.
The Naive Bayes algorithm, shown in Figure 3, has an AUC value of 0.588, which means poor
classification, whereas the SVM algorithm, in Figure 4, has an AUC value of 0.708, which means
good classification. In Figure 5, the Naive Bayes plus SMOTE algorithm has an AUC value of
0.908, and in Figure 6, the SVM plus SMOTE algorithm has an AUC value of 0.914, both of which
mean excellent classification.

Figure 3. Confusion matrix and ROC curve of Naive Bayes

Figure 6. Confusion matrix and ROC curve of SVM with SMOTE


Figure 7 summarizes the test results for all the models. It shows that the SVM + SMOTE model has
the highest accuracy, precision, F1-score, and AUC, while the fastest processing time belongs to
the SVM model. From these results, it can be concluded that SMOTE oversampling strongly improves
the performance of both algorithms and that the SVM + SMOTE model performs better than the other
models tested.

Figure 7. Comparison results of performance model


3.3. Wordcloud

A wordcloud is a tool for visualizing the keywords that appear most often in a text.
In this case, it shows the spread of the positive and negative words that most often arise from
TF-IDF. The wordcloud in Figure 8 is obtained from the total TF-IDF weights computed during
preprocessing for the positive and negative sentences.
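
The total weight per term that drives such a wordcloud can be sketched as follows. The TF-IDF
variant used here (relative term frequency multiplied by a logarithmic inverse document frequency
plus one) is an assumption, since the exact weighting formula is not stated, and the three toy
documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_totals(documents):
    """Sum the TF-IDF weight of every term over all documents."""
    n_docs = len(documents)
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    totals = Counter()
    for doc in documents:
        if not doc:
            continue  # skip documents that became empty after preprocessing
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[term]) + 1.0  # assumed smoothing variant
            totals[term] += tf * idf
    return totals


# Hypothetical, already preprocessed documents.
docs = [["love", "peace", "love"], ["hate", "attack"], ["peace", "solidarity"]]
totals = tfidf_totals(docs)
print(totals.most_common(3))
```

A dictionary of term weights like `totals` can then be handed to a plotting tool that draws each
term with a size proportional to its weight.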

Figure 8. Wordcloud from positive and negative sentences




3.4. Application 

The models were tested and validated using Naive Bayes and SVM without SMOTE as well as
Naive Bayes and SVM with SMOTE. The model with the highest test results is SVM with SMOTE,
so this model is used for deployment in the application. Figure 9 shows the flowchart of
the deployed application. The flow starts with opening the application and entering the text
whose sentiment is to be analyzed. The text then enters preprocessing phase 1, which includes
case folding, URL removal, mention removal, hashtag removal, emoticon conversion, number and
punctuation removal, and finally translation into English. After preprocessing phase 1,
preprocessing phase 2 consists of tokenizing, filtering by length, n-gram generation (tri-gram),
stopword removal, and stemming. The next step loads the previously saved model and makes
a prediction to determine whether the input text is positive or negative. For comparison, VADER
is also used to label the sentiment based on its scores.

Figure 9. Flowchart deployment model

The application is built using the Python programming language along with the required libraries.
The application built according to the flowchart in Figure 9 is available
at: https://analisa-sentimen.herokuapp.com/. When analyzing sentiment, the application shows
the input text after translation, after cleansing, and after preprocessing, the weight
distribution of words in the input text, and finally the sentiment analysis results from VADER and
from SVM with SMOTE. Figure 10 compares the sentiment analysis results obtained during
modeling.

Figure 10. Comparison of sentiment analysis results


4. CONCLUSION 

After conducting the research through several stages, it can be concluded from the sentiment
analysis of 2936 tweets labeled automatically using VADER that SVM with SMOTE oversampling of
the imbalanced data has the highest performance and a shorter processing time compared with
Naive Bayes, Naive Bayes with SMOTE, and SVM in the classification of tweets about
Islamophobia at the time of the Christchurch attack.

Future research could add other supervised learning methods for comparison in order to obtain
even better performance. Other techniques, such as downsampling, PSO, or chi-square, could also be
compared. Other tools for automatic data labeling, such as TextBlob or spaCy, could be applied,
as could other translation libraries, such as Google Cloud Translation, in order to handle
the Unicode data format.